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Abstract 

The classical EMD algorithm has been used extensively in the literature to decompose signals that 
contain nonlinear waves. However when a signal contain two or more frequencies that are close to one 
another the decomposition might fail. In this paper we propose a new formulation of this algorithm 
which is based on the zero crossings of the signal and show that it performs well even when the classical 
algorithm fail. We address also the filtering properties and convergence rate of the new algorithm versus 
the classical EMD algorithm. These properties are compared then to those of the principal component 
algorithm (PCA). Finally we apply this algorithm to the detection of gravity waves in the atmosphere. 
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1 Introduction 



In scientific literature there exist many classical sets of functions which can decompose a signal in terms 
of "simple" functions. For example Taylor or Fourier expansions are used routinely in scientific and 
engineering applications. (and many other exist). However in all these expansions the underlying functions 
are not intrinsic to the signal itself and a precise approximation to the original signal might require a large 
number of terms. This problem become even more acute when the signal is non- stationary and the process 
it represents is nonlinear. 

To overcome this problem many researchers used in the past the "principal component algorithm" 
(PCA) to come up with an "adaptive" set of functions which approximate a given signal. A new ap- 
proach to this problem emerged in the late 1990's when a NASA team has developed the "Empirical Mode 
Decomposition" algorithm(EMD) which attempt to decompose a signal in terms of it "intrinsic mode 
functions "(IMF) through a "sifting algorithm". A patent for this algorithm has been issued [1]. 
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The EMD algorithm is based on the following quote [2]: "According to Drazin the first step of data 
analysis is to examine the data by eye. From this examination, one can immediately identify the different 
scales directly in two ways: by the time lapse between successive alterations of local maxima and minima 
and by the time lapse between the successive zero crossings.... We have decided to to adopt the time lapse 
between successive extrema as the definition of the time scale for the intrinsic oscillatory mode" 

A step by step description of the EMD sifting algorithm is as follows: 

1. Let be given a function f{t) which is sampled at discrete times {t^, k = 1, . . . n}. 

2. Mho{k)^f{tk). 

3. Identify the max and min of /z-o(^)- 

4. Create the cubic spline curve that connects the maxima points. Do the same for the minima M„. 
This creates an envelope for ho{k). 

5. At each time evaluate the mean rrik of Mj. and M„ (m^ is referred to as the sifting function). 

6. Evaluate hi{k) — ho{k) — nik. 

7. If norm of 1 1 /to — 1 1 < e for some predetermined e set the first intrinsic function IMFi = hi (and 
stop). 

8. if the criteria of (7) are not satisfied set ho^k) = hi{k) and return to (3) ("Sifting process"). 

The algorithm has been applied successfully in various physical applications. However as has been 
observed by Flandrin [3] and others the EMD algorithm fails in many cases where the data contains two 
or more frequencies which are close to each other. 

To overcome this difficulty we propose hereby a modification of the EMD algorithm by replacing steps 
4 and 5 in the description above by the following: 

4. find the midpoints between two consecutive maxima and minima and let A^^ be the values of ho at 
these points. 

5. Create the spline curve nik that connects the points A^^. 

The essence of this modification is the replacement of the mean which is evaluated by the EMD algo- 
rithm as the average of the max-min envelopes by the spline curve of the mid-points between the maxima 
and minima. This is in line with the observation by Drazin (which was referred to above) that the scales 
inherent to the data can be educed either from the max-min or its zero crossing. In the algorithm we 
propose hereby we mimic the "zero-crossings" by the mid-points between the max-min. 

It is our objective in this paper to justify this modification of the EMD algorithm through some exam- 
ples and theoretical work. The plan of the paper is as follows: In Sec. 2 we provides examples of signals 
composed two or three close frequencies (with and without noise) where the classical EMD algorithm 
fails but the modified one yields satisfactory results. In Sec. 3 we carry out analytical analysis of the two 
algorithms which are applied to the same signal. In Sec. 4 we discuss the convergence rate, resolution and 
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related issues concerning the classical and new "midpoint algorithm" . Sec. 5 address the application of 
this algorithm to atmospheric data and in Sec. 6 we compare the EMD and PCA algorithms 

2 Examples and Comparisons 

Extensive experimentations were made to test and verify the efficiency of the modified algorithm. We 
present here the results of one of these tests in which the signal contains three close frequencies. (In our 
tests we considered also the effects of noise and phase shifts among the different frequencies) 

f(t) = ■^[cos(cJit) + cos(a;2t) + cos(w3t)] (1) 

where 

TT 

Ui = 12uo, UJ2 = lOWo, ^^3 = 8^0, UJQ = —. 

zoo 

To apply the EMD algorithm to this signal, we used a discrete representation of it over the interval 
[-2048, 2048] by letting tk+i - tk = I, k = 1, . . . , 4097. 

The results of the signal decompositions into IMFs and a comparison these IMFs with the frequencies 
present in the original signal are presented in figures 1 — 5. In all these figures the red lines represent the 
frequencies in the original signal (or its power spectrum) and the blue lines the corresponding intrinsic 
mode functions or their power spectrum which were obtained by the midpoint algorithm. 

Fig. 1 is a plot of the data for the signal described by (H). Fig. 2 represents the first IMF in the 
decomposition (versus the leading frequency in the data) while Figs. 3 — 5 depict the spectral density 
distribution for the first three IMFs versus those related to the original frequencies in the data. It should 
be observed that although the amplitude of the spectral densities in these plots are different (especially for 
IMF 3) the maxima of the spectral density in each plot is very close to the original one. 

The EMD algorithm is a high pass filter. For the n — th iteration of the filter its efficiency is measured 
by the parameter a which is defined by 

Yn = anYn-1 + a{Xn - Xn^l) 

where Xk and Yk are the input and output of the k — th iteration. Fig 6 present the value of the parameter 
a as a function of the iteration number for first IMF derived from the data of the signal in ©. 

3 Some Analytical Insights 

To obtain analytical insights about the performance of the EMD-midpoint algorithm we considered the 
following signal 

/(^) = 7;[^Os{u4^t) + COs(w5t)], UJ4 = —, UJ5 = —. (1) 
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Since the ratio of the frequencies in this signal is a rational number the signal is actually periodic with 
period p = 128 (See Fig. 7) and the behavior of the classical versus the mid-point algorithm can be 
delineated analytically (i.e without discretizations). 

On the interval [0,p] the extrema of the signal are given by ^ = and therefore it is easy to construct 
the spline approximation Smax{t), Smin{t) to the maximum and minimum points and compute their aver- 
age. Similarly we can find the midpoints between the maxima and minima and evaluate the corresponding 
spline approximation Smidit) to the signal at these points, after one iteration of the sifting process the 
"sifted signal" is given respectively by 

hmnit) = m - (^) + ^— (2) 

and 

hmidit) = fit)- S'w(t). (3) 

The efficiency of the two algorithm can be deduced by projecting these new signals on the Fourier com- 
ponents of the original signal. To this end we compute 

p rp 

hmn{t)cOs{u4)dt, bmn = / hmnit) Sm{i04t)dt. (4) 

Jo 

p rp 

Knnit) COs{u5t)dt, dmn = / h„mit) Sm{uJ5t)dt. (5) 

Jo Jo 
and 

rp rp 

amid = / hmidit) cos{u4t)dt, hmid = / ^mid(t) sm{ujr,t)dt. (6) 
Jo Jo 
rp rp 
Cmid= / hmidit) cos{uj4t)dt, dmid= / hmidit) sm{uj5t)dt. (7) 
Jo Jo 
The amplitude of the Fourier components of the two frequencies in the classical EMD algorithm is 

^mn = \/amn ~^ ^"mni J^mn = \/ ~^ dmn- (^) 

Similarly for the mid-point algorithm we 



Amid — yO-mid + ^mid' ^rnid — y Cmid + ^mid" (9) 

The objective of the sifting process is to eliminate one of the Fourier components in favor of the other. As 
a result the first IMF will contains, upon convergence, only one of the Fourier components in the original 
signal. Therefore the efficiency of the two algorithm can be inferred by comparing Amn versus Bmn and 
Amid versus Bmid- Computing the integrals that appear in eqs.(|4])-(|7) we obtain 

Amn = 31.63346911, Bmn = 29.70292046, (10) 

Aw = 34.19647843, 5^ = 20.81145369. (11) 

These results show that after one iteration the classical EMD did not separate the two frequencies effec- 
tively. On the other hand the mid-point algorithm performed well. 
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4 Convergence Rates 



To compare the convergence rates of the classical versus the midpoint algorithm we considered three cases 
all of which were composed of two frequencies. In the first case the two frequencies were well separated. 
In the second case the two frequencies were close while in the third case they were almost "overlapping". 
In all cases the signal was given by 

f{t) = -{cOS(x!it + C0SU2t) 

This signal was discretized on the interval [—2048, 2048] with At = 1. 
For the first case the two frequencies were 

TT 

(jji = 12a;, u!2 = Scj, u 



256 



As can be expected both the classical and midpoint algorithm were able to discern the individual frequen- 
cies through the sifting algorithm. However it took the classical algorithm 59 iterations to converge to 
the first IMF. On the other hand the midpoint algorithm converged in only 7 iterations (using the same 
convergence criteria). We wish to point out also that the midpoint algorithm has a lower computational 
cost than the classical algorithm. It requires in each iteration the computation of only one spline interpo- 
lating polynomial. On the other hand the classical algorithm requires two such polynomials, one for the 
maximum points and one for the minimum points. 
For the second test the frequencies were 

TT TT TV TT 
OOl — 1 ■ = 

24 288 24 288 

that is the difference between the two frequencies is 

In this case the midpoint algorithm was able to separate the two frequencies. Fig 8 and Fig 9 compare 
the power spectrum of the original frequencies versus those of IMFi and IMF2 which were obtained 
through this algorithm. Convergence to IMFi was obtained in 18 iterations and IMF2 was obtained by 7 
additional iterations. 

The classical EMD algorithm did converge to IM Fi in 45 iterations but the power spectrum of this 
IMF deviated significantly from the first frequency in the signal(See Fig 10). IMF2 failed (completely) 
to detected correctly the second frequency. 

In third case the frequencies were 

TT TT TT TT 



24 1000' 24 1000 

In this case the classical algorithm was unable to separate the two frequencies i.e IMFi contained both 
frequencies (See Fig 11). The midpoint algorithm did somewhat better but the resolution was not complete 
(See Fig 12). Moreover the sifting process in both cases led to the creation of "ghost frequencies" which 
were not present in the original signal. 
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At this juncture one might wonder if a "hybrid algorithm" whereby the sifting function is the average 
(or some similar combination) of those obtained by the classical and midpoint algorithms might outper- 
form the separate algorithms (in spite of the obvious additional computational cost). However our exper- 
imentations with such algorithm did not yield the desired results (i.e. the convergence rate and resolution 
did not improve). 

5 Applications to Atmospheric Data 

There have been recent interest in the observation and properties of gravity waves which are generated 

when wind is blowing over terrain. In part this interest stems from the fact that these waves carry energy 
and accurate measure of this data is needed to improve the performance of numerical weather prediction 
models. 

As part of this scientific campaign the USAF flew several balloons that collected information about the 
pressure and temperature as a function of height. The temperature data collected by one of these balloons 

is presented in Fig. 13 [6]. To analyze this signal we detrended first it by subtracting its mean from the 
data. When the mid-point EMD algorithm was applied to this detrended- signal the first IMF extracted the 
experimental noise from while the second and third IMFs educed clearly the gravity waves (the second 
IMF is depicted in Fig. 14). On the other hand the classical EMD algorithm failed to educe these waves 
from the detrended-signal. 

Subtracting the gravity waves that were detected by the mid-point algorithm from the detrended-signal 
we obtain the "turbulent residuals" whose spectrum is shown in Fig 15. The slope of this signal in the 
"inertial frequency range" is —2.7 which corresponds well with the fact that the flow in stratosphere is 
"quasi two-dimensional" [7-9]. 

6 EMD or PCA- A Comparison 

Before the emergence of the EMD algorithm an adaptive data analysis was provided by the "Principal 

Component Algorithm "(PCA) which is referred to also as the "Karahunan-Loeve (K-L) decomposition 
algorithm". (For a review see [10]) Here we shall give only a brief overview of this algorithm within in 
the geophysical context. 

Let a signal be represented by a a time series X (of length N) of some variable.We first determine a 
time delay A for which the points in the series are decorrelated. Using A we create n copies of the original 
series 

X{k), X(ci + A),...,X(A'; + (n- 1)A). 
(To create these one uses either periodicity or choose to consider shorter time-series). Then one computes 
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the auto-covariance matrix R = (Rij) 



N 



R.. = J2x{k + iA)X{k + jA). 



(1) 



k=l 



Let Ao > Ai, . . . , > A„_i be the eigenvalues of R with their corresponding eigenvectors 



(0; 




z = 



0,...,n-l. 



The original time series X can be reconstructed then as 



n— 1 



(2) 



k=0 



where 




(3) 



The essence of the PCA is based on the recognition that if a large spectral gap exists after the first mi 
eigenvalues of R then one can reconstruct the mean flow (or the large component ( of the data by using 
only the first mi eigenfunctions in Q. A recent refinement of this procedure due to Ghil et al ([10]) is that 
the data corresponding to eigenvalues between mi + 1 and up to the point m2 where they start to form a 
"continuum" represent waves. The location of m2 can be ascertained further by applying the tests devised 
by Axford [11] and Dewan [7]. 

Thus the original data can be decomposed into mean flow, waves and residuals (i.e. data corresponding 
to eigenvalues m2 + 1, . . . , n — 1 which we wish to interpret at least partly as turbulent residuals). 

The crucial step in this algorithm is the determination of the points mi and m2 whose position has to 
ascertained by additional tests whose results might be equivocal. 

We applied this algorithm to the geophysical data described in Sec. 5.1 with A = 96 and computed 
the resulting spectrum of the correlation matrix R. This spectrum is depicted in Fig. 16 . Based on this 
spectrum we choose mi = 3 and m2 = 11 we obtain the corresponding wave component of the signal that 
is shown in Fig. 17. 

We conclude that while the PCA algorithm provides an alternative to the EMD algorithm the determi- 
nation of the cutoff points is murky in many cases. However it will be advantageous if one apply the two 
algorithms in tandem in order to obtain a clear cut confirmation of the results. 
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High Pass Filter Parameter a -two close frequencies 
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Power Spectrum of IMF 1 vs. first frequency 
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Power Spectrum of IMF 2 vs. 2nd frequency 

12 1 1 1 1 1 1 1 1 — 




2 4 6 8 10 12 14 16 18 20 

frequncy(Hz) 



17 



Power Spectrum of IMF 1 vs. first frequency 
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Power Spectrum of IMF 1 vs. first frequency 
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Power Spectrum of IMF 1 vs. first frequency 
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